Active Learning for Mention Detection: A Comparison of Sentence Selection Strategies

نویسندگان

  • Nitin Madnani
  • Hongyan Jing
  • Nanda Kambhatla
  • Salim Roukos
چکیده

We propose and compare various sentence selection strategies for active learning for the task of detecting mentions of entities. The best strategy employs the sum of con dences of two statistical classi ers trained on di erent views of the data. Our experimental results show that, compared to the random selection strategy, this strategy reduces the amount of required labeled training data by over 50% while achieving the same performance. The e ect is even more signi cant when only named mentions are considered: the system achieves the same performance by using only 42% of the training data required by the random selection strategy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Iranian EFL Learners’ Lexical Inferencing Strategies at Both Text and Sentence levels

Lexical inferencing is one of the most important strategies in vocabulary learning and it plays an important role in dealing with unknown words in a text. In this regard, the aim of this study was to determine the lexical inferencing strategies used by Iranian EFL learners when they encounter unknown words at both text and sentence levels. To this end, forty lower intermediate students were div...

متن کامل

Sentence Processing Among Native vs. Nonnative Speakers: Implications for Critical Period Hypothesis

The present study intended to investigate the processing behavior of 2 groups of L2 learners of English (high and mid in proficiency) and a group of English native speakers on English active and passive reduced relative clauses. Three sets of tasks, an offline task, and 2 online tasks were conducted. Results revealed that the high-proficiency group’s performance was the same as that of the nati...

متن کامل

A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...

متن کامل

Sentence-Length Informed Method for Active Learning Based Resource-Poor Statistical Machine Translation

This paper presents a simple but effective sentence-length informed method to select informative sentences for active learning (AL) based SMT. A length factor is introduced to penalize short sentences to balance the “exploration” and “exploitation” problem. The penalty is dynamically updated at each iteration of sentence selection by the ratio of the current candidate sentence length and the ov...

متن کامل

Enhancing Active Learning for Semantic Role Labeling via Compressed Dependency Trees

This paper explores new approaches to active learning (AL) for semantic role labeling (SRL), focusing in particular on combining typical informativity-based sampling strategies with a novel measure of representativeness based on compressed dependency trees (CDTs). In essence, the compressed representation encodes the target predicate and the key dependents of the verb complex in the sentence. W...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/0911.1965  شماره 

صفحات  -

تاریخ انتشار 2009